A new speech enhancement: speech stream segregation

نویسندگان

  • Hiroshi G. Okuno
  • Tomohiro Nakatani
  • Takeshi Kawabata
چکیده

Speech stream segregation is presented as a new speech enhancement for automatic speech recognition. Two issues are addressed: speech stream segregation from a mixture of sounds, and interfacing speech stream segregation with automatic speech recognition. Speech stream segregation is modeled as a process of extracting harmonic fragments, grouping these extracted harmonic fragments, and substituting non-harmonic residue for non-harmonic parts of groups. The main problem in interfacing speech stream segregation with HMM-based speech recognition is how to improve the degradation of recognition performance due to spectral distortion of segregated sounds, which is caused mainly by transfer function of a binaural input. Our solution is to re-train the parameters of HMM with training data binauralized for four directions. Experiments with 500 mixtures of two women's utterances of a word showed that the cumulative accuracy of word recognition up to the 10th candidate of each woman's utterance is, on average, 75%.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

A New Shuffled Sub-swarm Particle Swarm Optimization Algorithm for Speech Enhancement

In this paper, we propose a novel algorithm to enhance the noisy speech in the framework of dual-channel speech enhancement. The new method is a hybrid optimization algorithm, which employs the  combination of  the  conventional θ-PSO and the shuffled sub-swarms particle optimization (SSPSO) technique. It is known that the θ-PSO algorithm has better optimization performance than standard PSO al...

متن کامل

Challenge Problem for Computational Auditory Scene Analysis: Understanding Three Simultaneous Speeches

Understanding three simultaneous speeches is proposed as a challenge problem to foster arti cial intelligence, speech and sound understanding or recognition, and computational auditory scene analysis research. Automatic speech recognition under noisy environments is attacked by speech enhancement techniques such as noise reduction and speaker adaptation. However, the signal-to-noise ratio of sp...

متن کامل

Understanding Three Simultaneous Speeches

Understanding three simultaneous speeches is proposed as a challenge problem to foster artificial intelligence, speech and sound understanding or recognition, and computational auditory scene analysis research. Automatic speech recognition under noisy environments is attacked by speech enhancement techniques such as noise reduction and speaker adaptation. However, the signal-to-noise ratio of s...

متن کامل

Speech Enhancement by Modified Convex Combination of Fractional Adaptive Filtering

This paper presents new adaptive filtering techniques used in speech enhancement system. Adaptive filtering schemes are subjected to different trade-offs regarding their steady-state misadjustment, speed of convergence, and tracking performance. Fractional Least-Mean-Square (FLMS) is a new adaptive algorithm which has better performance than the conventional LMS algorithm. Normalization of LMS ...

متن کامل

Speech enhancement based on hidden Markov model using sparse code shrinkage

This paper presents a new hidden Markov model-based (HMM-based) speech enhancement framework based on the independent component analysis (ICA). We propose analytical procedures for training clean speech and noise models by the Baum re-estimation algorithm and present a Maximum a posterior (MAP) estimator based on Laplace-Gaussian (for clean speech and noise respectively) combination in the HMM ...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 1996